Abstract

Grade of membership (GoM) analysis was introduced in 1974 [7] as a
means of analyzing multivariate categorical data. Since then, it has been 
successfully applied to many problems. The primary goal of GoM anal- 
ysis is to derive properties of individuals based on results of multivariate 
measurements; such properties are given in the form of the expectations 
of a hidden random variable (state of an individual) conditional on the 
result of observations. 

In this article, we present a new perspective for the GoM model, based 
on considering distribution laws of observed random variables as realiza- 
tions of another random variable. It happens that some moments of this 
new random variable are directly estimable from observations. Our ap- 
proach allows us to establish a number of important relations between 
estimable moments and values of interest, which, in turn, provides a basis 
for a new numerical procedure. 

Keywords: Grade of membership analysis, latent structure analysis, 
multivariate categorical data, linear regression, multidimensional distri- 
bution. 

AMS 2000 subject classifications: Primary 62H12; secondary 62J99. 

1 Introduction 

The grade of membership (GoM) analysis was initially introduced in [7]; the
term "Grade of Membership" is due to this article.

GoM considers $J$ discrete measurements on each individual, represented by
random variables $X_1, \dots, X_J$, with the set of outcomes of the $j$-th measurement being
$\{1, \dots, L_j\}$.

*This research was supported by grants from National Institute of Aging. 



M. Kovtun, I. Akushevich, K. G. Manton, and H. D. Tolley



The goal of GoM analysis is to derive some properties of an individual based 
on results of measurements. We refer to this (general and informal) specification 
of goals as the General GoM Problem (GGP.) As GGP is a general concept, there 
may be many different but reasonable answers to the problem. The present ar- 
ticle proposes one possible approach to GGP, which leads to notable theoretical 
results and allows construction of a novel numerical procedure. 

Having GGP as its primary goal, GoM differs from many other statistical 
methods, whose goal is to discover some properties of a population. For example, 
in estimation of voting results the most interesting fact is how many people will 
vote for, or against (a candidate or an issue), and it does not matter how a 
particular individual votes. In contrast, in making a medical diagnosis the 
health of a particular individual is of interest, and it does not matter (for a 
particular diagnosis) how prevalent a particular health state is in a population. 

Mathematically, one possibility to express GGP is to assume that there 
exists a hidden continuous random variable G representing knowledge about 
individuals derivable from observations (in the diagnostic example, it is the 
health state of an individual.) Now one is interested in what might be said 
about the value of $G$ based on observed values of $X_1, \dots, X_J$. More specifically,
the values of interest are expectations of $G$ conditional on values of random variables
$X_1, \dots, X_J$: $\mathcal{E}(G \mid X_1 = x_1, \dots, X_J = x_J)$.

Considering a continuous hidden random variable resembles latent structure
analysis in general, and latent trait analysis in particular (see JJI^IE].) The 
connection between GoM and latent structure analysis was mentioned in the 
literature (0); see more details in 0.) We prefer to keep the name "grade of 
membership analysis" because: (a) its primary goal differs from that of the 
latent structure analysis, and (b) GoM uses a proprietary technique and is 
based on special facts that are not used in latent structure analysis. However, 
we believe that techniques developed in the present article and results obtained 
here might benefit the development of latent structure analysis. 

The main result of the present article, contained in section 7, is that the
values of interest (i.e. conditional expectations and conditional variances) are
solutions of system (36) and that, under modest conditions, only values of interest
are solutions of this system. Furthermore, as corollary 7.6 shows, system
(36) can be solved by a two-step process, every step of which consists of solving
a problem of linear algebra.

GoM analysis (as well as many flavors of latent structure analysis) employs 
an assumption that the problem under consideration has lower dimensionality 
than observed data. Our theorem 7.3 and its corollary give a way to estimate
this dimensionality directly (which usually presents a substantial problem in
this kind of analysis.)

An additional advantage of our approach is that it not only establishes a 
way to estimate values of interest, but also provides a ground for evaluation of 
confidence intervals (not addressed in the present article.) 

The rest of the article is organized as follows. 

In section 3 we mathematically formulate the problem and define related
notions. The central idea here (which is crucial for further results) is to consider
individual distribution laws as realizations of another random variable, $\beta$. We
show that initial data are sufficient to estimate a set of mixed moments of this
distribution up to order $J$ (the number of measurements.)

In section 4 we consider the GoM problem as a problem of finding a
low-dimensional distribution and obtain basic corollaries of this hypothesis.

In section 5 we consider a hypothesis that there exists a linear regression of
observed random variables $X_j$ on a hidden random variable $G$. We show that this
hypothesis is essentially equivalent to the one considered in the previous section.

In section 6 we establish relations between distributions and moments of $\beta$
and $G$, and find transformation laws for changing their basis. The main result
of this section is equation (29).

In section 7 we consider a system of equations (36). We show that values of
interest are always solutions of this system, and we establish sufficient conditions
under which system (36) has only such solutions.

In section 8 we outline a numerical procedure for estimating values of interest
and discuss its properties.

2 Preliminaries 

2.1 Notation 

$\mathbb{Z}$ is the set of integers, and $\mathbb{R}$ is the set of reals. $\mathbb{Z}_+$ and $\mathbb{R}_+$ are the subsets
of positive, and $\mathbb{Z}_+^0$ and $\mathbb{R}_+^0$ are the subsets of nonnegative, integers and reals,
respectively.

For $m, n \in \mathbb{Z}$, $[m..n]$ denotes the set of integers between $m$ and $n$:
$[m..n] = \{z \in \mathbb{Z} \mid m \le z \le n\}$. If $m > n$, $[m..n] = \emptyset$.

$\mathbb{R}^n$ is the $n$-dimensional linear space over the reals, and $\mathbb{S}^n$ is the $(n-1)$-dimensional
unit simplex in $\mathbb{R}^n$, $\mathbb{S}^n = \{x \in \mathbb{R}^n \mid x_i \ge 0 \text{ and } \sum_i x_i = 1\}$.

For a linear subspace $Q \subseteq \mathbb{R}^n$, $\dim(Q)$ denotes its dimension.

For $x^1, \dots, x^p \in \mathbb{R}^n$, $\mathrm{Lin}(x^1, \dots, x^p)$ denotes the linear subspace of $\mathbb{R}^n$ spanned
by $x^1, \dots, x^p$, and $\mathrm{rank}(x^1, \dots, x^p)$ denotes the rank of the system of vectors $x^1, \dots, x^p$
(thus, $\mathrm{rank}(x^1, \dots, x^p) = \dim(\mathrm{Lin}(x^1, \dots, x^p))$.)

For $a \in \mathbb{R}$ (or $a \in \mathbb{Z}$) and an index $i$, $a_i$ denotes the vector from $\mathbb{R}^n$ ($\mathbb{Z}^n$,
respectively) with $i$-th component equal to $a$ and all other components equal to 0.
The dimensionality of $a_i$ will be clear from context.

2.2 Support of measures 

We consider only probabilistic measures defined on the $\sigma$-algebra of Borel sets of
$\mathbb{R}^n$; a measure $\mu$ is a probabilistic measure if $\mu(\mathbb{R}^n) = 1$.

A support of a measure $\mu$ is a closed set $A \subseteq \mathbb{R}^n$ such that $\mu(A) = 1$. We do
not require minimality of a support: if $A$ is a support of $\mu$, $A \subseteq A'$, and $A'$ is
closed, then $A'$ is also a support of $\mu$.

We use $\mathrm{Supp}(\mu)$ to denote the set of all supports of $\mu$. Thus, $A \in \mathrm{Supp}(\mu)$
means "$A$ is a support of $\mu$." Note that $A \in \mathrm{Supp}(\mu)$ implies that $A$ is closed.

2.3 Indexing contingency tables and related objects

We need a way for indexing cells in a contingency table and for other objects 
having similar structure. 

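A contingency table for a set of $J$ discrete measurements, with $L_j$ possible
outcomes for measurement $j$, is a $J$-dimensional table having $L_j + 1$ cells in
dimension $j$. The index for the $j$-th dimension ranges from 0 to $L_j$.

More formally, let $\mathcal{L}_\infty = \{(\ell_1, \dots, \ell_J) \mid \ell_j \in \mathbb{Z}_+\}$ and
$\mathcal{L}_\infty^0 = \{(\ell_1, \dots, \ell_J) \mid \ell_j \in \mathbb{Z}_+^0\}$, i.e. sets of $J$-dimensional vectors with positive and, respectively,
nonnegative integer components. There is a one-to-one correspondence between
sets of $J$ discrete measurements and vectors in $\mathcal{L}_\infty$: a vector $L = (L_1, \dots, L_J)$
describes a set of $J$ measurements, in which measurement $j$ has $L_j$ outcomes.

For every $L \in \mathcal{L}_\infty$, let $\mathcal{L}_L = \{\ell \in \mathcal{L}_\infty \mid \ell_j \le L_j\}$ and
$\mathcal{L}_L^0 = \{\ell \in \mathcal{L}_\infty^0 \mid \ell_j \le L_j\}$. If $L$ defines a set of measurements, $\mathcal{L}_L$ is the set of all possible outcomes
of these measurements, and $\mathcal{L}_L^0$ is a complete set of indices for the contingency
table. In addition, for every $\mathcal{J} \subseteq [1..J]$, let
$\mathcal{L}_L^{[\mathcal{J}]} = \{\ell \in \mathcal{L}_L^0 \mid \ell_j = 0 \Leftrightarrow j \in \mathcal{J}\}$.
The set $\mathcal{J}$ indicates measurements that we exclude from consideration, and
a vector $\ell \in \mathcal{L}_L^{[\mathcal{J}]}$ contains results of all measurements except those listed in $\mathcal{J}$.
Note that $\mathcal{L}_L^{[\emptyset]} = \mathcal{L}_L$ and $\mathcal{L}_L^0 = \bigcup_{\mathcal{J} \subseteq [1..J]} \mathcal{L}_L^{[\mathcal{J}]}$.

A vector $\ell' \in \mathcal{L}^{[\mathcal{J}]}$ may be considered as describing a family of outcomes
$\{\ell \in \mathcal{L} \mid \ell_j = \ell'_j \text{ for } j \notin \mathcal{J}\}$. Abusing notation, we will also use $\ell'$ to denote this
family. More generally, we write $\ell' \in \ell''$ for $\ell' \in \mathcal{L}^{[\mathcal{J}']}$ and $\ell'' \in \mathcal{L}^{[\mathcal{J}'']}$ whenever
$\ell'_j = \ell''_j$ for all $j \notin \mathcal{J}''$ (note that $\ell' \in \ell''$ is possible only when $\mathcal{J}' \subseteq \mathcal{J}''$.) For
$\ell \in \mathcal{L}$, let $\ell^{[\mathcal{J}']}$ be the vector from $\mathcal{L}^{[\mathcal{J}']}$ such that $\ell^{[\mathcal{J}']}_j = \ell_j$ for all $j \notin \mathcal{J}'$.
We always have $\ell \in \ell^{[\mathcal{J}']}$. We write $\ell^{[j]}$, $\ell^{[j', j'']}$, etc. instead of $\ell^{[\{j\}]}$, $\ell^{[\{j', j''\}]}$,
etc., respectively.

Let also $|L| = \sum_j L_j$ and $|L^*| = \prod_j L_j$.

We always assume that the set of our measurements is described by a vector
$L$. We drop index $L$ in the notations $\mathcal{L}_L$ and $\mathcal{L}_L^0$ if it does not create an ambiguity.

A contingency table may be constructed for any sample by putting in the cell
with index $\ell$ the number of individuals who (a) have outcome $\ell_j$ for measurement
$j$ if $\ell_j \ne 0$; and (b) have arbitrary outcomes for all other measurements. Let
$N_\ell$ be the value in the $\ell$-th cell of the contingency table. The usual summation rule for
contingency tables in our notation is: for any $\mathcal{J}' \subseteq \mathcal{J}'' \subseteq [1..J]$ and $\ell'' \in \mathcal{L}^{[\mathcal{J}'']}$,
$N_{\ell''} = \sum_{\ell' \in \mathcal{L}^{[\mathcal{J}']} :\ \ell' \in \ell''} N_{\ell'}$. Note that $N = N_{(0,\dots,0)}$ is the sample size.

A frequency table is obtained from a contingency table by dividing the value
in each cell by $N$. We use $f_\ell$ to denote the value of the $\ell$-th cell of the frequency table.
The above summation rule is applicable to frequency tables as well.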
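
As an illustration only, the indexing convention of this section can be sketched in Python. The sample, the vector `L`, and the function name `contingency_table` below are assumptions made for the example; they do not come from the article. The sketch fills every cell of $\mathcal{L}^0$ (index 0 standing for "any outcome") and checks the summation rule on one cell.

```python
from collections import Counter
from itertools import product

def contingency_table(sample, L):
    """Count, for every index l in L^0, the individuals matching l.

    A component l_j = 0 means "any outcome of measurement j".
    """
    table = Counter()
    for outcome in sample:
        # Each observed outcome contributes to 2^J cells: every way of
        # replacing some of its components by the wildcard index 0.
        for mask in product((0, 1), repeat=len(L)):
            cell = tuple(x if keep else 0 for x, keep in zip(outcome, mask))
            table[cell] += 1
    return table

# Hypothetical sample: J = 2 measurements with L = (2, 3) outcomes.
L = (2, 3)
sample = [(1, 1), (1, 3), (2, 3), (2, 3), (1, 2)]
N = contingency_table(sample, L)

# Summation rule: the cell (1, 0) is the sum of the cells (1, l), l = 1..3.
assert N[(1, 0)] == sum(N[(1, l)] for l in range(1, L[1] + 1))
# The all-zero cell holds the sample size.
assert N[(0, 0)] == len(sample)

# Frequency table: divide every cell by the sample size.
f = {cell: count / N[(0, 0)] for cell, count in N.items()}
```

Storing all $2^J$ wildcard variants of each outcome is wasteful for large $J$; it is done here only to make the cell with index $\ell \in \mathcal{L}^0$ directly addressable.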

3 The Problem 

We consider a population of a potentially infinite number of individuals, every
individual being subject to $J$ measurements with discrete outcomes. Without
loss of generality, we may assume that outcomes of the $j$-th measurement are
$\{1, \dots, L_j\}$.

The result of measurements on individual $i$ is a random vector
$X^i = (X^i_1, \dots, X^i_J)$, which takes values in $\mathcal{L}$. Such a random vector is described
by a $|L|$-dimensional vector of probabilities $\beta^i = (\beta^i_{jl})_{jl}$ ($j \in [1..J]$; and for
every $j$, $l \in [1..L_j]$), where $\beta^i_{jl} = \Pr(X^i_j = l)$.

These vectors of probabilities $\beta^i$ may themselves be considered as realizations
of a random vector $\beta$, with a distribution described by a probabilistic measure
$\mu_\beta$ on $\mathbb{R}^{|L|}$.



We start with elementary properties, which may be directly derived from 
definitions. 

As $\beta_{jl}$ are probabilities, they satisfy

$$\text{(a) } \beta_{jl} \ge 0 \qquad \text{(b) for all } j:\ \sum_{l=1}^{L_j} \beta_{jl} = 1 \qquad (1)$$

Thus, a product of simplices $\mathbb{S}^L = \prod_j \mathbb{S}^{L_j}$ is a support of the measure
$\mu_\beta$: $\mathbb{S}^L \in \mathrm{Supp}(\mu_\beta)$.

Together with random vectors $X^i$, we consider a "composite" random vector
$X = (X_1, \dots, X_J)$: on the first step, one randomly selects a vector of probabilities
$\beta$ (in accordance with measure $\mu_\beta$), and on the second step, one randomly
selects outcomes in accordance with the probabilities $\beta$ selected on the first step.

According to our definitions, the conditional probability for $X_j$ is:

$$\Pr(X_j = l \mid \beta) = \beta_{jl} \qquad (2)$$

from which one obtains by the law of total probability

$$\Pr(X_j = l) = \int \Pr(X_j = l \mid \beta)\, \mu_\beta(d\beta) = \int \beta_{jl}\, \mu_\beta(d\beta) \qquad (3)$$

We need more assumptions about $\mu_\beta$ to derive useful properties of the model.
One reasonable assumption is "local independence":

(G1) Conditional on the value of $\beta$, random variables $X_1, \dots, X_J$ are mutually
independent, i.e. for every $\ell \in \mathcal{L}^0$

$$\Pr\Big( \bigwedge_{j :\ \ell_j \ne 0} X_j = \ell_j \,\Big|\, \beta \Big) = \prod_{j :\ \ell_j \ne 0} \Pr(X_j = \ell_j \mid \beta) \qquad (4)$$

A motivation for such an assumption is that all "randomness" in $X_1, \dots, X_J$
comes from errors in measurements, and an error in one measurement does not
depend on an error in another one. Further, "conditional on value of parameters"
means that we are considering a group of individuals having the same values
$\beta$; thus, every individual in a group has the same vector of probabilities $\beta$, and
the restriction of our random vector $X$ to this group has the vector of probabilities
$\beta$ as well; as we assumed that for every individual the random variables describing
him are independent, this should be true for a group of identical (with respect
to our random variables) individuals. It is also worth mentioning that the local
independence assumption is used in almost all variations of latent structure
analysis.

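With the independence assumption (G1), (3) may be strengthened to:

$$\forall \ell \in \mathcal{L}^0 :\ \Pr\Big( \bigwedge_{j :\ \ell_j \ne 0} X_j = \ell_j \Big) = \int \Big( \prod_{j :\ \ell_j \ne 0} \beta_{j\ell_j} \Big) \mu_\beta(d\beta) \qquad (5)$$

For every $\ell \in \mathcal{L}^0$, let the $\ell$-moment of distribution $\mu_\beta$ be

$$M_\ell(\mu_\beta) = \int \Big( \prod_{j :\ \ell_j \ne 0} \beta_{j\ell_j} \Big) \mu_\beta(d\beta) \qquad (6)$$

In particular, we have $\mathcal{L}^{[1,\dots,J]} = \{(0, \dots, 0)\}$, and $M_{(0,\dots,0)}(\mu_\beta) = \int \mu_\beta(d\beta) = 1$.

Comparing (5) with (6) we see that the $\ell$-moment of distribution $\mu_\beta$ is equal
to the probability of the set of outcomes $\ell$.

For $\ell \in \mathcal{L}^{[\mathcal{J}]}$, $M_\ell(\mu_\beta)$ is a mixed moment of $\mu_\beta$ of order $J - |\mathcal{J}|$. The set of
moments for all $\ell \in \mathcal{L}^0$ does not exhaust, however, the set of all moments of
order up to $J$ (for example, a moment $\int \beta_{11}\beta_{12}\, \mu_\beta(d\beta)$ is not an $\ell$-moment.) At
the end of section 7 we shall discuss in more detail whether $\{M_\ell(\mu_\beta)\}_\ell$ can
determine all moments of order up to $J$.

A basic statistical fact is that frequencies $f_\ell$ are consistent and efficient
estimators for $M_\ell(\mu_\beta)$.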

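The following proposition and its corollary are an equivalent of the summation
rule for contingency and frequency tables.

Proposition 3.1 Let $\mathcal{J}' \subseteq \mathcal{J}'' \subseteq [1..J]$. Then for every $\ell'' \in \mathcal{L}^{[\mathcal{J}'']}$,

$$M_{\ell''}(\mu_\beta) = \sum_{\ell' \in \mathcal{L}^{[\mathcal{J}']} :\ \ell' \in \ell''} M_{\ell'}(\mu_\beta)$$

Proof. For every $j_0 \in \mathcal{J}'' \setminus \mathcal{J}'$ and for every $\ell \in \mathcal{L}^{[\mathcal{J}'']}$ we have, using (1b):

$$M_\ell(\mu_\beta) = \int \Big( \prod_{j \notin \mathcal{J}''} \beta_{j\ell_j} \Big) \Big( \sum_{l=1}^{L_{j_0}} \beta_{j_0 l} \Big) \mu_\beta(d\beta) = \sum_{l=1}^{L_{j_0}} \int \Big( \prod_{j \notin \mathcal{J}''} \beta_{j\ell_j} \Big) \beta_{j_0 l}\, \mu_\beta(d\beta) = \sum_{l=1}^{L_{j_0}} M_{\ell + l_{j_0}}(\mu_\beta)$$

The rest of the proof is induction over the size of $\mathcal{J}'' \setminus \mathcal{J}'$. ■

Corollary 3.2 For every $\mathcal{J} \subseteq [1..J]$, $\sum_{\ell \in \mathcal{L}^{[\mathcal{J}]}} M_\ell(\mu_\beta) = 1$. In particular,
$\sum_{\ell \in \mathcal{L}} M_\ell(\mu_\beta) = 1$.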
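
The definition (6), proposition 3.1, and corollary 3.2 can be checked on a small example. The sketch below assumes a toy measure $\mu_\beta$ concentrated on two atoms (a finite mixture); the atoms, their weights, and the helper name `moment` are illustrative assumptions, not part of the article. Exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction as F
from itertools import product

# Assumed toy measure mu_beta: two atoms with weights 1/4 and 3/4.
# Each atom is a vector of probabilities, beta[j][l - 1] = Pr(X_j = l),
# for J = 2 measurements with L = (2, 2) outcomes.
atoms = [
    (F(1, 4), [[F(1, 2), F(1, 2)], [F(1, 3), F(2, 3)]]),
    (F(3, 4), [[F(1, 5), F(4, 5)], [F(1, 1), F(0, 1)]]),
]

def moment(ell):
    """l-moment (6): integral of prod_{j: l_j != 0} beta_{j l_j} over mu_beta."""
    total = F(0)
    for weight, beta in atoms:
        term = weight
        for j, l in enumerate(ell):
            if l != 0:  # l_j = 0 excludes measurement j from the product
                term *= beta[j][l - 1]
        total += term
    return total

# The moment with all measurements excluded is the total mass.
assert moment((0, 0)) == 1
# Proposition 3.1 with J' = {} and J'' = {2}: sum over outcomes of X_2.
assert moment((1, 0)) == moment((1, 1)) + moment((1, 2))
# Corollary 3.2: moments of all complete outcomes sum to 1.
assert sum(moment(ell) for ell in product((1, 2), repeat=2)) == 1
```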

Below we consider another two (essentially equivalent) assumptions. The
first one is that a support of $\mu_\beta$ is restricted to a $(K-1)$-dimensional affine
plane in $\mathbb{R}^{|L|}$. The second assumption is that there exists a random variable $G$
taking values in $\mathbb{R}^K$ such that there exists a linear regression of random variables
$X_1, \dots, X_J$ on $G$.



4 Low-dimensional distributions

The second assumption that we consider is:

(G2') The support of $\mu_\beta$ is a $K$-dimensional linear subspace $Q$ of $\mathbb{R}^{|L|}$, and any
proper subspace of $Q$ does not support $\mu_\beta$.

We include the second clause (no proper subspace of $Q$ supports $\mu_\beta$) to avoid
degenerate cases. Any degenerate case may be considered as a nondegenerate case
for some $K' < K$.

As $\mathbb{S}^L \in \mathrm{Supp}(\mu_\beta)$, the intersection $P_\beta = Q \cap \mathbb{S}^L$ is necessarily nonempty, and
this intersection supports $\mu_\beta$. In general, $P_\beta$ is a $(K-1)$-dimensional polyhedral
body, which has at least $K$ vertices. Let $\mathbb{P}_\beta$ be the $(K-1)$-dimensional affine
space spanned by $P_\beta$.

Let $\Lambda = \{\lambda^1, \dots, \lambda^K\}$ be a linear basis of $Q$. We also consider $\Lambda$ as a $|L| \times K$
matrix,

$$\Lambda = \begin{pmatrix} \lambda^1_{11} & \cdots & \lambda^K_{11} \\ \vdots & & \vdots \\ \lambda^1_{JL_J} & \cdots & \lambda^K_{JL_J} \end{pmatrix} \qquad (7)$$

There exists considerable freedom in choosing $\Lambda$. We shall exploit it by
imposing constraints on $\Lambda$. The first one is:

($\Lambda_0$) For every $k$, $\lambda^k \in \mathbb{P}_\beta$.

Let $g = (g_1, \dots, g_K)$ be a vector of coordinates of a point $\beta \in Q$ in basis $\Lambda$,
i.e. $\beta = \sum_{k=1}^{K} g_k \lambda^k$. Then ($\Lambda_0$) implies

$$\sum_{k=1}^{K} g_k \lambda^k \in \mathbb{P}_\beta \iff \sum_{k=1}^{K} g_k = 1 \qquad (8)$$

If $\Lambda$ and $\Lambda'$ are two bases of $Q$, there exists a nondegenerate $K \times K$ matrix
$A = (a^{k'}_k)_{k'k}$ such that $\Lambda' = \Lambda A$.

Using the fact that $(1, \dots, 1)$ is a left eigenvector of matrix $A$ corresponding
to eigenvalue 1 if and only if every column of $A$ sums to 1, one
easily obtains the following two propositions:

Proposition 4.1 Let both $\Lambda$ and $\Lambda'$ satisfy ($\Lambda_0$). Then $(1, \dots, 1)$ is a left
eigenvector of matrix $A$ with eigenvalue 1.

Proposition 4.2 Let $\Lambda$ satisfy ($\Lambda_0$) and let $A$ be a nonsingular matrix with left
eigenvector $(1, \dots, 1)$ with eigenvalue 1. Then $\Lambda' = \Lambda A$ satisfies ($\Lambda_0$).

If $g$ is a coordinate vector of $\beta \in Q$ in basis $\Lambda$, $\beta = \Lambda g$, then the coordinate
vector of $\beta$ in basis $\Lambda' = \Lambda A$ is $g' = A^{-1} g$.

Remark 4.3 In matrix expressions (like $\beta = \Lambda g$ above,) we always assume
that all vectors are columns. ■
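
Proposition 4.2 and the rule $g' = A^{-1} g$ can be illustrated numerically. The sketch below is an assumption-laden toy case ($K = 2$, $J = 2$, $L = (2, 2)$); the matrices `Lam` and `A` are made up for the example. For vectors of $Q$ it tests the block-sum condition $\sum_l \lambda^k_{jl} = 1$ for every $j$, which is what membership in the plane $\mathbb{P}_\beta$ amounts to in this toy case.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][m] * Y[m][k] for m in range(len(Y)))
             for k in range(len(Y[0]))] for i in range(len(X))]

# Assumed basis, as a |L| x K matrix; rows are ordered (j, l) = (1,1), (1,2), (2,1), (2,2).
Lam = [[F(1, 2), F(1, 5)],
       [F(1, 2), F(4, 5)],
       [F(1, 3), F(1)],
       [F(2, 3), F(0)]]

# Assumed nonsingular transition matrix whose columns sum to 1.
A = [[F(2), F(-1, 2)],
     [F(-1), F(3, 2)]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]

Lam2 = matmul(Lam, A)  # new basis Lambda' = Lambda A

blocks = [(0, 2), (2, 4)]  # row ranges belonging to measurement 1 and 2

def block_sums(M, k):
    """Sums of column k of M over the rows of each measurement."""
    return [sum(M[r][k] for r in range(a, b)) for a, b in blocks]

# Every new basis vector still has all block sums equal to 1.
assert all(block_sums(Lam2, k) == [1, 1] for k in range(2))

# Coordinates transform as g' = A^{-1} g and describe the same point beta.
g = [[F(1, 4)], [F(3, 4)]]
beta = matmul(Lam, g)
g2 = matmul(A_inv, g)
assert matmul(Lam2, g2) == beta
assert sum(row[0] for row in g2) == 1
```

Note that the new basis vectors need not have nonnegative components: ($\Lambda_0$) only places them in the affine plane, not in the polyhedron $P_\beta$.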

Every choice of a basis $\Lambda$ induces a linear map:

$$H_\Lambda : \mathbb{R}^K \to Q, \qquad H_\Lambda(g) = \sum_k g_k \lambda^k \qquad (9)$$

Note that $\Lambda$ is the matrix of the linear map $H_\Lambda$ with respect to the
standard unit bases in $\mathbb{R}^K$ and $\mathbb{R}^{|L|}$.

When the basis $\Lambda$ satisfies ($\Lambda_0$), $H_\Lambda^{-1}(\mathbb{P}_\beta)$ is the unit affine plane $\mathbb{P}_g$ in $\mathbb{R}^K$,
$\mathbb{P}_g = \{g \in \mathbb{R}^K \mid \sum_k g_k = 1\}$, and $P_g^\Lambda = H_\Lambda^{-1}(P_\beta)$ is a convex $(K-1)$-dimensional
polyhedron in $\mathbb{P}_g$.

The map $H_\Lambda$ allows us to introduce a measure $\mu_g^\Lambda$ on $\mathbb{P}_g$, defined as:

$$\mu_g^\Lambda(B) = \mu_\beta(H_\Lambda(B)) \quad \text{for every Borel set } B \subseteq \mathbb{P}_g \qquad (10)$$

As $P_\beta \in \mathrm{Supp}(\mu_\beta)$, we have $P_g^\Lambda \in \mathrm{Supp}(\mu_g^\Lambda)$.

Thus, we can replace integration over $P_\beta$ by integration over $P_g^\Lambda$:

$$\int_{P_\beta} \phi(\beta)\, \mu_\beta(d\beta) = \int_{P_g^\Lambda} \phi(H_\Lambda(g))\, \mu_g^\Lambda(dg) \qquad (11)$$

for every measurable function $\phi$.

Remark 4.4 We are trying to reflect in our notation all substantial dependencies
between objects. Measure $\mu_\beta$ and polyhedron $P_\beta$, of course, do not depend
on the choice of $\Lambda$; thus, there is no index $\Lambda$ in the notation $\mu_\beta$ and $P_\beta$. On the contrary,
the map $H$ defined by (9) (and consequently the polyhedron $P_g$ and the measure $\mu_g$ defined
by (10)) substantially depends on the choice of $\Lambda$, so we use the notation $H_\Lambda$,
$P_g^\Lambda$, and $\mu_g^\Lambda$. However, we shall drop the index $\Lambda$ in the above notation if it is
obvious from the context. ■

5 Linear regression hypothesis 

A random variable $X_j$ has a finite range $[1..L_j]$, on which no arithmetic operations
are defined. This prevents us from considering expectation, variance, etc.
of $X_j$. To cope with this problem, we associate with every $X_j$ a random vector
$Y_j$ taking values in $\mathbb{R}^{L_j}$ and defined as: if $X_j = l$, then $Y_j = 1_l$ (recall that $1_l$ is
an $L_j$-dimensional vector with $l$-th component equal to 1, and all other components
equal to 0.)

There is an important connection between distributions of $X_j$ and $Y_j$: if
$(\beta_{jl})_l$ is a vector of probabilities of $X_j$, $\beta_{jl} = \Pr(X_j = l)$, then $\mathcal{E}(Y_j) = (\beta_{jl})_l$
(here and below $\mathcal{E}(\cdot)$ denotes expectation.) In general, for every condition $C$ we
have $\mathcal{E}(Y_j \mid C) = (\Pr(X_j = l \mid C))_l$.

Remark 5.1 As $Y_j$ is an $L_j$-dimensional vector, $\mathcal{E}(Y_j)$ is also an $L_j$-dimensional
vector. We use $\mathcal{E}_m(\cdot)$ to denote the $m$-th component of the vector expectation. ■

Thus, we have

Proposition 5.2 For every $j$ and for every condition $C$, (a) $\mathcal{E}_l(Y_j \mid C) \ge 0$,
and (b) $\sum_l \mathcal{E}_l(Y_j \mid C) = 1$.
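
The passage from $X_j$ to the indicator vector $Y_j$ is easy to sketch in code. The function names `indicator` and `expectation` and the probability vector `beta_j` below are assumptions made for illustration; the check confirms that $\mathcal{E}(Y_j)$ reproduces the vector of probabilities and satisfies proposition 5.2.

```python
from fractions import Fraction as F

def indicator(l, L_j):
    """Vector 1_l in R^{L_j}: l-th component 1, all other components 0."""
    return [1 if m == l else 0 for m in range(1, L_j + 1)]

def expectation(dist, L_j):
    """E(Y_j) for a variable X_j with Pr(X_j = l) = dist[l - 1]."""
    acc = [F(0)] * L_j
    for l, p in enumerate(dist, start=1):
        acc = [a + p * y for a, y in zip(acc, indicator(l, L_j))]
    return acc

# Hypothetical distribution of a measurement with three outcomes.
beta_j = [F(1, 6), F(1, 3), F(1, 2)]
e_y = expectation(beta_j, 3)

assert e_y == beta_j                 # E(Y_j) = (beta_jl)_l
assert all(c >= 0 for c in e_y)      # proposition 5.2 (a)
assert sum(e_y) == 1                 # proposition 5.2 (b)
```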

Now we can formulate an alternative form of assumption (G2):

(G2'') There exists a random vector $G$, defined on individuals and taking values
in $\mathbb{R}^K$, such that:

(a) There exists a joint distribution of $G$ and $X$.

(b) The local independence assumption holds, i.e. random variables $(X_1 \mid g)$,
$\dots$, $(X_J \mid g)$ are mutually independent.

(c) For every $j$, a regression of $Y_j$ on $G$ is linear.

(d) For any $K' < K$ there is no random vector $G'$ satisfying (a)-(c).

Again, clause (d) is intended to prevent degenerate cases.
Clause (c) means that for every $j$, there exist vectors $(\lambda^1_{jl})_l, \dots, (\lambda^K_{jl})_l$ such
that

$$\mathcal{E}(Y_j \mid G = g) = \Big( \sum_k g_k \lambda^k_{jl} \Big)_l \qquad (12)$$

or, in matrix form,

$$\mathcal{E}(Y_j \mid G = g) = \Lambda_j \cdot g \qquad (13)$$

where $\Lambda_j = (\lambda^k_{jl})_{lk}$ is the $L_j \times K$ matrix with columns $(\lambda^k_{jl})_l$.

Taking into account the relation between $\mathcal{E}(Y_j)$ and the probability distribution
of $X_j$, one obtains:

Theorem 5.3 (G2') holds if, and only if, (G2'') holds.

The random vector $G$, if it exists, is not defined uniquely: for every nondegenerate
$K \times K$ matrix $A$, the random vector $G' = A^{-1} G$ also satisfies (G2''),
as:

$$\mathcal{E}(Y_j \mid G' = g') = \mathcal{E}(Y_j \mid AG' = Ag') = \mathcal{E}(Y_j \mid G = Ag') = \Lambda_j \cdot (A \cdot g') = (\Lambda_j \cdot A) \cdot g' = \Lambda'_j \cdot g' \qquad (14)$$

This nonuniqueness corresponds to the nonuniqueness of the basis for $Q$
discussed in section 4. Again, one may choose $G$ in such a way that ($\Lambda_0$) is
satisfied.

Corollary 5.4 In presence of ($\Lambda_0$), the possible values of $G$ satisfy $\sum_k g_k = 1$.
In other words, $G$ takes values in the unit affine plane $\mathbb{P}_g \subset \mathbb{R}^K$.

Corollary 5.5 In presence of ($\Lambda_0$), the set of possible values of $G$ is a bounded
polyhedron $P_g \subset \mathbb{P}_g$.

We are primarily interested in what can be said about the value of $G$ given
outcomes of $X_1, \dots, X_J$. The most interesting values are $\mathcal{E}(G \mid X = \ell)$ and
$\mathcal{D}(G \mid X = \ell)$ (where $\mathcal{D}(\cdot)$ denotes variance.) We shall derive equations for these
values in the next section.

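6 Relations between $\mu_\beta$ and $\mu_g$

As (G2') and (G2'') are equivalent, we refer to (either of) them as (G2).

Under condition (G2) we have two distributions, $\mu_\beta$ and $\mu_g^\Lambda$, connected by
(10) and (11). In this section we establish further relations between $\mu_\beta$ and $\mu_g^\Lambda$.

Throughout this section, we assume that some basis $\Lambda$ of $Q$ is fixed. We
drop index $\Lambda$ in all notation; however, the reader has to keep in mind that
distribution $\mu_g$, as well as all its moments, depends on $\Lambda$.

6.1 Unconditional moments

We can express $\ell$-moments of $\mu_\beta$ via moments of $\mu_g$. Let $\mathcal{J} \subseteq [1..J]$ and $\ell \in \mathcal{L}^{[\mathcal{J}]}$.
Then:

$$M_\ell(\mu_\beta) = \int \Big( \prod_{j \notin \mathcal{J}} \beta_{j\ell_j} \Big) \mu_\beta(d\beta) = \int \Big( \prod_{j \notin \mathcal{J}} \sum_k g_k \lambda^k_{j\ell_j} \Big) \mu_g(dg) = \sum_{w \in \mathcal{W}^{[\mathcal{J}]}} \Big( M_w(\mu_g) \cdot \prod_{j \notin \mathcal{J}} \lambda^{w_j}_{j\ell_j} \Big) \qquad (15)$$

Here $\mathcal{W}^{[\mathcal{J}]} = \{(w_1, \dots, w_J) \mid w_j \in [1..K] \text{ if } j \notin \mathcal{J},\ w_j = 0 \text{ if } j \in \mathcal{J}\}$, and for
$w \in \mathcal{W}^{[\mathcal{J}]}$,

$$M_w(\mu_g) = \int \Big( \prod_{j \notin \mathcal{J}} g_{w_j} \Big) \mu_g(dg) \qquad (16)$$

is a mixed moment of measure $\mu_g$ of order $J - |\mathcal{J}|$.

Note that $\mathcal{W}^{[\mathcal{J}]} = \mathcal{L}^{[\mathcal{J}]}_{(K,\dots,K)}$. Thus, we freely apply to $\mathcal{W}$ all notations and
conventions developed for $\mathcal{L}$ in section 2.3.

The sets of indices $\mathcal{W}^{[\mathcal{J}]}$ are redundant in the sense that different elements
of $\mathcal{W}^{[\mathcal{J}]}$ correspond to the same moments. However, $\mathcal{W}^{[\mathcal{J}]}$ has the following
nice property:

Proposition 6.1 Let $\mathcal{J}' \subseteq \mathcal{J}'' \subseteq [1..J]$. Then for every $w'' \in \mathcal{W}^{[\mathcal{J}'']}$,

$$M_{w''}(\mu_g) = \sum_{w' \in \mathcal{W}^{[\mathcal{J}']} :\ w' \in w''} M_{w'}(\mu_g)$$

Proof. Similar to the proof of proposition 3.1. ■

Corollary 6.2 For every $\mathcal{J} \subseteq [1..J]$, $\sum_{w \in \mathcal{W}^{[\mathcal{J}]}} M_w(\mu_g) = 1$. In particular,
$\sum_{w \in \mathcal{W}} M_w(\mu_g) = 1$.

To handle the redundancy of $\mathcal{W}$, we introduce a new set of indices.
Let $\mathcal{V}[J', K'] = \{(v_1, \dots, v_{K'}) \mid v_k \in [0..J'] \text{ and } \sum_k v_k = J'\}$. We write
$\mathcal{V}[J']$ instead of $\mathcal{V}[J', K]$ and $\mathcal{V}$ instead of $\mathcal{V}[J, K]$.
For every $\mathcal{J} \subseteq [1..J]$ and for every $v \in \mathcal{V}[J - |\mathcal{J}|]$, let
$\mathcal{W}_v^{[\mathcal{J}]} = \{w \in \mathcal{W}^{[\mathcal{J}]} \mid \text{for every } k,\ w \text{ contains exactly } v_k \text{ components equal to } k\}$.
Let also $C_v^{[\mathcal{J}]} = |\mathcal{W}_v^{[\mathcal{J}]}|$.

Proposition 6.3

$$\text{(a) } |\mathcal{V}[J', K']| = \frac{(J' + K' - 1)!}{J'!\,(K' - 1)!} \qquad \text{(b) } C_v^{[\mathcal{J}]} = \frac{(J - |\mathcal{J}|)!}{v_1! \cdots v_K!}$$

Proof. (a) By induction over $J' + K'$ from the recurrent equality
$|\mathcal{V}[J', K']| = |\mathcal{V}[J' - 1, K']| + |\mathcal{V}[J', K' - 1]|$.

(b) Let $J' = J - |\mathcal{J}|$. By direct computation one obtains:

$$C_v^{[\mathcal{J}]} = \binom{J'}{v_1} \binom{J' - v_1}{v_2} \cdots \binom{J' - v_1 - \dots - v_{K-1}}{v_K}$$

from which the statement of the proposition is straightforward. ■

One corollary to proposition 6.3 is that $C_v^{[\mathcal{J}]}$ depends on $\mathcal{J}$ only through $|\mathcal{J}|$,
and this value is contained in index $v$; thus, we can safely drop index $[\mathcal{J}]$ and
write just $C_v$.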
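
Both counts in proposition 6.3 can be confirmed by brute-force enumeration. The sketch below is illustrative only; the helper names `V` and `W_v` and the sizes $J' = 4$, $K = 3$ are assumptions, and $\mathcal{J} = \emptyset$ is taken so that every component of $w$ lies in $[1..K]$.

```python
from itertools import product
from math import comb, factorial, prod

def V(Jp, Kp):
    """All (v_1, ..., v_K') with v_k in [0..J'] and v_1 + ... + v_K' = J'."""
    return [v for v in product(range(Jp + 1), repeat=Kp) if sum(v) == Jp]

def W_v(v):
    """All w in [1..K]^{J'} containing exactly v_k components equal to k."""
    K, Jp = len(v), sum(v)
    return [w for w in product(range(1, K + 1), repeat=Jp)
            if all(w.count(k + 1) == v[k] for k in range(K))]

Jp, K = 4, 3

# (a): the number of index vectors is (J' + K' - 1)! / (J'! (K' - 1)!).
assert len(V(Jp, K)) == comb(Jp + K - 1, K - 1)

# (b): C_v is the multinomial coefficient J'! / (v_1! ... v_K!).
for v in V(Jp, K):
    assert len(W_v(v)) == factorial(Jp) // prod(factorial(x) for x in v)
```

The enumeration grows as $K^{J'}$, so it is only meant for checking small cases; the closed forms are what one would use in practice.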

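As for every $w, w' \in \mathcal{W}_v^{[\mathcal{J}]}$ we have $M_w(\mu_g) = M_{w'}(\mu_g)$ for every measure
$\mu_g$, we can define $v$-moments of a measure $\mu_g$ as

$$M_v(\mu_g) = M_{w_0}(\mu_g) = \int \prod_k g_k^{v_k}\, \mu_g(dg) \qquad (17)$$

and normalized $v$-moments of a measure $\mu_g$ as

$$\bar{M}_v(\mu_g) = \sum_{w \in \mathcal{W}_v} M_w(\mu_g) = C_v \cdot M_{w_0}(\mu_g) = \int C_v \prod_k g_k^{v_k}\, \mu_g(dg) = C_v M_v(\mu_g) \qquad (18)$$

In both equations, $w_0$ is an arbitrary element of $\mathcal{W}_v$. Note that both $M_v(\mu_g)$
and $\bar{M}_v(\mu_g)$ do not depend on $\mathcal{J}$.

$\mathcal{V}[J']$ is the smallest possible set of indices for $J'$-order mixed moments of
$\mu_g$. The multiplier $C_v$ in (18) allows us to obtain

Proposition 6.4 For every $\mu_g$,

$$\sum_{v \in \mathcal{V}[J']} \bar{M}_v(\mu_g) = \sum_{v \in \mathcal{V}[J']} C_v M_v(\mu_g) = 1$$

Proof. Follows from proposition 6.1. ■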
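Now we can continue (15):

$$M_\ell(\mu_\beta) = \sum_{v \in \mathcal{V}[J']} \sum_{w \in \mathcal{W}_v^{[\mathcal{J}]}} \Big( M_w(\mu_g) \cdot \prod_{j \notin \mathcal{J}} \lambda^{w_j}_{j\ell_j} \Big) = \sum_{v \in \mathcal{V}[J']} \Big( M_v(\mu_g) \cdot \sum_{w \in \mathcal{W}_v^{[\mathcal{J}]}} \prod_{j \notin \mathcal{J}} \lambda^{w_j}_{j\ell_j} \Big) = \sum_{v \in \mathcal{V}[J']} \big( M_v(\mu_g) \cdot \Lambda(\mathcal{J}, v, \ell) \big) \qquad (19)$$

where $J' = J - |\mathcal{J}|$ and

$$\Lambda(\mathcal{J}, v, \ell) = \sum_{w \in \mathcal{W}_v^{[\mathcal{J}]}} \prod_{j \notin \mathcal{J}} \lambda^{w_j}_{j\ell_j} \qquad (20)$$

Grade of Membership Analysis: Approach to Foundations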

6.2 Conditional moments 

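For the joint distribution of $X = (X_1, \dots, X_J)$ and $G$ we have, on the one hand,

$$d\Pr(G = g \wedge X = \ell) = \Pr(X = \ell \mid G = g) \cdot d\Pr(G = g) = \Big( \prod_j \sum_k g_k \lambda^k_{j\ell_j} \Big) \mu_g(dg) \qquad (21)$$

and, on the other hand,

$$d\Pr(G = g \wedge X = \ell) = d\Pr(G = g \mid X = \ell) \cdot \Pr(X = \ell) = d\Pr(G = g \mid X = \ell) \cdot M_\ell(\mu_\beta) \qquad (22)$$

Combining (21) and (22), one obtains

$$d\Pr(G = g \mid X = \ell) = \frac{\prod_j \sum_k g_k \lambda^k_{j\ell_j}}{M_\ell(\mu_\beta)}\, \mu_g(dg) \qquad (23)$$

Similarly, for every $\mathcal{J} \subseteq [1..J]$ and for every $\ell \in \mathcal{L}^{[\mathcal{J}]}$,

$$d\Pr(G = g \mid X = \ell) = \frac{\prod_{j \notin \mathcal{J}} \sum_k g_k \lambda^k_{j\ell_j}}{M_\ell(\mu_\beta)}\, \mu_g(dg) \qquad (24)$$

where for $\ell \in \mathcal{L}^{[\mathcal{J}]}$, $X = \ell$ means $\bigwedge_{j \notin \mathcal{J}} X_j = \ell_j$.

This allows us to conclude that the conditional distribution of $G \mid X = \ell$ is
absolutely continuous with respect to measure $\mu_g$, and

$$p_\ell(g) = \frac{\prod_{j \notin \mathcal{J}} \sum_k g_k \lambda^k_{j\ell_j}}{M_\ell(\mu_\beta)} \qquad (25)$$

is its probability density function.

Having this, we may write (for every $J'$, $v \in \mathcal{V}[J']$, $\mathcal{J} \subseteq [1..J]$, $\ell \in \mathcal{L}^{[\mathcal{J}]}$) a
$v$-order mixed moment of $G$ conditional on $X = \ell$:

$$\mathcal{E}(G^v \mid X = \ell) = \int g^v\, p_\ell(g)\, \mu_g(dg) \qquad (26)$$

where $g^v$ denotes $\prod_k g_k^{v_k}$.

A special case of equation (26) for $v = (0, \dots, 0)$ is

$$\mathcal{E}(G^{(0,\dots,0)} \mid X = \ell) = \int g^{(0,\dots,0)}\, p_\ell(g)\, \mu_g(dg) = \int p_\ell(g)\, \mu_g(dg) = 1 \qquad (27)$$

Using equation (26), we may obtain for every $j \in \mathcal{J}$ and every $l \in [1..L_j]$:

$$\sum_k \lambda^k_{jl} \cdot \mathcal{E}(G^{v + 1_k} \mid X = \ell) = \int g^v \Big( \sum_k g_k \lambda^k_{jl} \Big) p_\ell(g)\, \mu_g(dg) = \int g^v\, \frac{\big( \sum_k g_k \lambda^k_{jl} \big) \prod_{j' \notin \mathcal{J}} \sum_k g_k \lambda^k_{j'\ell_{j'}}}{M_\ell(\mu_\beta)}\, \mu_g(dg) = \frac{M_{\ell + l_j}(\mu_\beta)}{M_\ell(\mu_\beta)} \int g^v\, p_{\ell + l_j}(g)\, \mu_g(dg) = \frac{M_{\ell + l_j}(\mu_\beta)}{M_\ell(\mu_\beta)} \cdot \mathcal{E}(G^v \mid X = \ell + l_j) \qquad (28)$$

By multiplying both sides of (28) by $M_\ell(\mu_\beta)$ one obtains:

$$\sum_k \lambda^k_{jl} \cdot \big( M_\ell(\mu_\beta) \cdot \mathcal{E}(G^{v + 1_k} \mid X = \ell) \big) = M_{\ell + l_j}(\mu_\beta) \cdot \mathcal{E}(G^v \mid X = \ell + l_j) \qquad (29)$$

Equation (29) is the main fact that allows us to establish a numerical procedure
to estimate conditional expectations. This equation holds for every $J' \ge 0$,
$v \in \mathcal{V}[J']$, $\mathcal{J} \subseteq [1..J]$, $\ell \in \mathcal{L}^{[\mathcal{J}]}$, $j \in \mathcal{J}$, and $l \in [1..L_j]$.

Although equation (29) holds for every $J'$ and $\mathcal{J}$, the most important case
is $J' + |\mathcal{J}| < J$: as we shall see in section 7, only conditional moments $\mathcal{E}(G^v \mid
X = \ell)$, $v \in \mathcal{V}[J']$, $\ell \in \mathcal{L}^{[\mathcal{J}]}$, with $J' + |\mathcal{J}| < J$ may be identified from data.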
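
Equation (29) can be verified exactly on a small discrete example. The sketch below assumes a toy basis `lam` and a toy measure `mu_g` with three atoms ($K = 2$, $J = 2$, $L = (2, 2)$); these values and the helper names are illustrative assumptions. The function `h` computes the product $M_\ell(\mu_\beta) \cdot \mathcal{E}(G^v \mid X = \ell)$ as a single integral over $\mu_g$, which is the quantity appearing on both sides of (29).

```python
from fractions import Fraction as F
from math import prod

K = 2
# lam[k][j][l - 1] = lambda^k_{jl}: assumed toy basis, J = 2, L = (2, 2).
lam = [
    [[F(1, 2), F(1, 2)], [F(1, 3), F(2, 3)]],
    [[F(1, 5), F(4, 5)], [F(1), F(0)]],
]
# Assumed discrete measure mu_g: pairs (weight, g) with g_1 + g_2 = 1.
mu_g = [(F(1, 4), (F(1), F(0))),
        (F(1, 4), (F(0), F(1))),
        (F(1, 2), (F(1, 2), F(1, 2)))]

def density_numerator(g, ell):
    """prod_{j: l_j != 0} sum_k g_k lambda^k_{j l_j}, the numerator of p_l(g)."""
    return prod(sum(g[k] * lam[k][j][l - 1] for k in range(K))
                for j, l in enumerate(ell) if l != 0)

def h(v, ell):
    """M_l(mu_beta) * E(G^v | X = l), computed as one integral over mu_g."""
    return sum(w * prod(g[k] ** v[k] for k in range(K)) * density_numerator(g, ell)
               for w, g in mu_g)

def check_29(v, ell, j, l):
    """Compare both sides of (29) for measurement j (0-based) and outcome l."""
    assert ell[j] == 0  # (29) requires j to be an excluded measurement
    lhs = sum(lam[k][j][l - 1] * h(tuple(v[i] + (i == k) for i in range(K)), ell)
              for k in range(K))
    ell_plus = tuple(l if i == j else x for i, x in enumerate(ell))
    return lhs == h(v, ell_plus)

cases = [((0, 0), (0, 0), 0, 1), ((1, 0), (0, 2), 0, 1), ((0, 1), (1, 0), 1, 2)]
assert all(check_29(*case) for case in cases)

# Conditional expectation E(G_1 | X_1 = 1) as a ratio of two such integrals.
cond_mean = h((1, 0), (1, 0)) / h((0, 0), (1, 0))
```

In this sketch the measure $\mu_g$ is known, so both sides can be computed directly; in the setting of the article only the moments $M_\ell(\mu_\beta)$ are estimable, and (29) is used as an equation for the unknown quantities.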



6.3 Conditional variance 

To make use of conditional expectations, one would like to know the variance of $G$
conditional on outcomes of measurements. It is not hard to express the variance via
conditional moments:

$$\mathcal{D}_k(G \mid X = \ell) = \int \big( g_k - \mathcal{E}_k(G \mid X = \ell) \big)^2 p_\ell(g)\, \mu_g(dg) = \mathcal{E}(G^{2_k} \mid X = \ell) - \mathcal{E}^2(G^{1_k} \mid X = \ell) \qquad (30)$$

As we shall show below, $\mathcal{E}(G^{2_k} \mid X = \ell)$ can be identified only for $\ell$ having at
least two components equal to 0; thus, the same condition applies to identifiability
of $\mathcal{D}(G \mid X = \ell)$.

6.4 Change of basis 

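Let $\Lambda' = \Lambda A$ be another basis of $Q$. Here $A$ is a nonsingular $K \times K$ matrix,

$$A = \begin{pmatrix} a^1_1 & \cdots & a^K_1 \\ \vdots & & \vdots \\ a^1_K & \cdots & a^K_K \end{pmatrix} \quad \text{and} \quad A^{-1} = \begin{pmatrix} \bar{a}^1_1 & \cdots & \bar{a}^K_1 \\ \vdots & & \vdots \\ \bar{a}^1_K & \cdots & \bar{a}^K_K \end{pmatrix} \qquad (31)$$

As was mentioned above, if a vector $\beta \in Q$ has coordinates $g$ in basis
$\Lambda$, then it has coordinates $g' = A^{-1} g$ in basis $\Lambda'$. Thus, $A^{-1}$ is the matrix of
transition from coordinates $g$ to coordinates $g'$. A question of interest is how
the moments of $G$ are changed under this transition.

We start with moments $M_w$ for $w \in \mathcal{W}$. Let $M'_w$ be a moment calculated in
coordinates $g'$. Then:

$$M'_{(k_1,\dots,k_J)} = \int \big( \bar{a}^1_{k_1} g_1 + \dots + \bar{a}^K_{k_1} g_K \big) \cdots \big( \bar{a}^1_{k_J} g_1 + \dots + \bar{a}^K_{k_J} g_K \big)\, \mu_g(dg) = \sum_{m_1 = 1}^{K} \cdots \sum_{m_J = 1}^{K} \bar{a}^{m_1}_{k_1} \cdots \bar{a}^{m_J}_{k_J}\, M_{(m_1,\dots,m_J)} \qquad (32)$$

which suggests that $\{M_w\}_{w \in \mathcal{W}}$ is a covariant tensor of rank $J$. Employing
Einstein's convention for summation, (32) may be rewritten as

$$M'_{(k_1,\dots,k_J)} = \bar{a}^{m_1}_{k_1} \cdots \bar{a}^{m_J}_{k_J}\, M_{(m_1,\dots,m_J)} \qquad (33)$$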

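Tensor $\{M_w\}_{w \in \mathcal{W}}$ is symmetric, and $\{M_v\}_{v \in \mathcal{V}}$ is a set of its essential
components (as for any $w \in v$, $M_w = M_v$.) Transformation rules for $M_v$ have the
form:

$$M'_v = \sum_{v' \in \mathcal{V}} \Big( \sum_{w' \in v'} \bar{a}^{w'_1}_{w_1} \cdots \bar{a}^{w'_J}_{w_J} \Big) M_{v'} \qquad (34)$$

where $w$ is an arbitrary element of $v$.

For the general case of conditional moments of arbitrary order, one obtains

$$\mathcal{E}(G'^{v} \mid X = \ell) = \sum_{v'} \Big( \sum_{w' \in v'} \bar{a}^{w'_1}_{w_1} \cdots \bar{a}^{w'_{J'}}_{w_{J'}} \Big)\, \mathcal{E}(G^{v'} \mid X = \ell) \qquad (35)$$

Here $v \in \mathcal{V}[J']$ for some $J'$, $v'$ ranges over $\mathcal{V}[J']$, $w$ and $w'$ are restricted to the
set $\mathcal{W}[J'] = \{(w_1, \dots, w_{J'}) \mid w_j \in [1..K]\}$, $w$ is an arbitrary element with $w \in v$,
and $w \in v$ means "for every $k$, $w$ contains exactly $v_k$ components equal to $k$."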
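
The tensor transformation rule (33) can be checked for second-order moments on a discrete example. In the sketch below the matrix `A_inv` and the measure `mu_g` are assumed toy values, and indices are 0-based; moments in the new coordinates are computed twice, once directly from $g' = A^{-1} g$ and once from the old moments through (33).

```python
from fractions import Fraction as F
from itertools import product

K = 2
# Assumed inverse transition matrix A^{-1}; entry [k][m] multiplies g_m in g'_k.
A_inv = [[F(3, 5), F(1, 5)],
         [F(2, 5), F(4, 5)]]
# Assumed discrete measure mu_g: pairs (weight, g) with g_1 + g_2 = 1.
mu_g = [(F(1, 4), (F(1), F(0))),
        (F(1, 4), (F(0), F(1))),
        (F(1, 2), (F(1, 2), F(1, 2)))]

def to_new_coords(g):
    """Coordinates in the new basis: g' = A^{-1} g."""
    return tuple(sum(A_inv[k][m] * g[m] for m in range(K)) for k in range(K))

def second_moments(measure):
    """M_(k1,k2) = integral of g_k1 * g_k2 over a discrete measure."""
    return {(k1, k2): sum(w * g[k1] * g[k2] for w, g in measure)
            for k1, k2 in product(range(K), repeat=2)}

M = second_moments(mu_g)

# Moments in new coordinates, computed directly from transformed atoms.
M_new_direct = second_moments([(w, to_new_coords(g)) for w, g in mu_g])

# The same moments obtained from the old ones by the tensor rule (33).
M_new_tensor = {
    (k1, k2): sum(A_inv[k1][m1] * A_inv[k2][m2] * M[(m1, m2)]
                  for m1, m2 in product(range(K), repeat=2))
    for k1, k2 in product(range(K), repeat=2)
}

assert M_new_direct == M_new_tensor
```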

7 Main system of equations 

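Consider a system of equations,

$$\begin{cases} \sum_k \alpha^k_{jl} \cdot h^{v + 1_k}_{\ell} = h^{v}_{\ell + l_j}, & v \in \mathcal{V}[J'],\ \mathcal{J} \subseteq [1..J] : |\mathcal{J}| > J',\ \ell \in \mathcal{L}^{[\mathcal{J}]},\ j \in \mathcal{J},\ l \in [1..L_j] \\ h^{(0,\dots,0)}_{\ell} = M_\ell, & \ell \in \mathcal{L}^0 \\ \sum_{v \in \mathcal{V}[J']} C_v\, h^{v}_{(0,\dots,0)} = 1, & J' \in [0..J] \end{cases} \qquad (36)$$

with respect to unknowns $\alpha^k_{jl}$ and $h^v_\ell$.

Equations (29) and (27) together with proposition 6.4 give us

Theorem 7.1 Let $\{M_\ell\}_{\ell \in \mathcal{L}^0}$ be a set of $\ell$-moments of a distribution $\mu_\beta$ which
satisfies (G1) and (G2). Let also $\{\lambda^k\}_k$, $\lambda^k = (\lambda^k_{jl})_{jl}$, be some basis of the support
of $\mu_\beta$, and $\mathcal{E}(G^v \mid X = \ell)$ be conditional moments calculated with respect to this
basis.

Then $\alpha^k_{jl} = \lambda^k_{jl}$ and $h^v_\ell = M_\ell \cdot \mathcal{E}(G^v \mid X = \ell)$ give a solution of system (36).

In other words, all values we are interested in are solutions of system (36).
Below we establish sufficient conditions for the case when (36) has only such
solutions.

For the sake of convenience, we (abusing language) shall speak about "solution
$\alpha^1, \dots, \alpha^K$", having in mind "there exist $h^v_\ell$ such that $\alpha^1, \dots, \alpha^K$ together
with $h^v_\ell$ compose a solution."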

Let $\alpha^1, \dots, \alpha^K$, $\alpha^k = (\alpha^k_{jl})_{jl}$, and $h^v_\ell$ be a solution of (36). Let $\alpha'^1, \dots, \alpha'^K$
be any set of vectors such that $\mathrm{Lin}(\alpha'^1, \dots, \alpha'^K) = \mathrm{Lin}(\alpha^1, \dots, \alpha^K)$. In this case
there exists a nonsingular $K \times K$ matrix $A = (a^{k'}_k)_{k'k}$ such that $(\alpha'^1, \dots, \alpha'^K) =
(\alpha^1, \dots, \alpha^K) A$. Let $A^{-1} = (\bar{a}^{k'}_k)_{k'k}$. By straightforward computation one can
show that $\alpha'^1, \dots, \alpha'^K$, together with the values $h'^v_\ell$ obtained from $h^v_\ell$ by the
transformation rule (35), also is a solution of (36).

Thus, we can speak about the space of solutions $\mathrm{Lin}(\alpha^1, \dots, \alpha^K)$. Note that
at this point we have no arguments for uniqueness of the space of solutions;
moreover, we cannot even claim that every space of solutions has the same
dimension $K$. In fact, in the general case the space of solutions is not unique. However,
in presence of the sufficient conditions that we establish below, the space of solutions
is unique.

Consider equations from the first group of (36) for $v = (0, \dots, 0)$ and $\ell =
(0, \dots, 0)$:

$$\Big\{ \sum_k \alpha^k_{jl}\, h^{1_k}_{(0,\dots,0)} = h^{(0,\dots,0)}_{l_j}, \qquad j \in [1..J],\ l \in [1..L_j] \qquad (37)$$

and substitute values for $h^{(0,\dots,0)}_{l_j}$ from the second group of (36):

$$\Big\{ \sum_k \alpha^k_{jl}\, h^{1_k}_{(0,\dots,0)} = M_{l_j}, \qquad j \in [1..J],\ l \in [1..L_j] \qquad (38)$$

As $h^{1_k}_{(0,\dots,0)}$ do not depend on $j$ and $l$, we obtain

Proposition 7.2 $(M_{l_j})_{jl} \in \mathrm{Lin}(\alpha^1, \dots, \alpha^K)$ for every solution $\alpha^1, \dots, \alpha^K$ of
(36).

In other words, the vector $(M_{l_j})_{jl}$ belongs to every space of solutions.
Applying similar considerations to the case $\ell = l'_{j'}$, for some $j' \in [1..J]$,
$l' \in [1..L_{j'}]$, we obtain:

$$\Big\{ \sum_k \alpha^k_{jl}\, h^{1_k}_{l'_{j'}} = M_{l'_{j'} + l_j}, \qquad j \ne j',\ l \in [1..L_j] \qquad (39)$$

In system (39) we have equations not for all $j$, $l$ but only for those in which
$j \ne j'$. Thus, (39) does not give us a vector from a solution space. However, it
allows us to claim that for every $j'$, $l'$, the vector $(M_{l'_{j'} + l_j})_{jl :\ j \ne j'}$ (having $\sum_{j \ne j'} L_j$
components) may be extended (by adding $L_{j'}$ components) to a $|L|$-dimensional
vector that belongs to $\mathrm{Lin}(\alpha^1, \dots, \alpha^K)$.

In general, for every $\ell \in \mathcal{L}^0 \setminus \mathcal{L}$ we have:

$$\Big\{ \sum_k \alpha^k_{jl}\, h^{1_k}_{\ell} = M_{\ell + l_j}, \qquad j : \ell_j = 0,\ l \in [1..L_j] \qquad (40)$$

and thus we obtain further incomplete vectors that may be completed to vectors
belonging to $\mathrm{Lin}(\alpha^1, \dots, \alpha^K)$.

Let us write the vector $(M_{l_j})_{jl}$ together with the incomplete vectors $(M_{l'_{j'} + l_j})_{jl :\ j \ne j'}$,
etc., as columns of a matrix, with places for which we do not have moments filled
by question marks. We refer to this incomplete matrix as the moment matrix.
The moment matrix contains a column for every $\ell \in \mathcal{L}^0 \setminus \mathcal{L}$. Figure 1 gives an
example of (part of) a moment matrix for the case $J = 3$, $L_1 = L_2 = L_3 = 2$.
Columns in this matrix correspond to $\ell$ = (000), (100), (200), (010), (020),
(001), (002), (110); other columns are not shown.

$$M = \begin{pmatrix}
M_{(100)} & ? & ? & M_{(110)} & M_{(120)} & M_{(101)} & M_{(102)} & ? & \cdots \\
M_{(200)} & ? & ? & M_{(210)} & M_{(220)} & M_{(201)} & M_{(202)} & ? & \cdots \\
M_{(010)} & M_{(110)} & M_{(210)} & ? & ? & M_{(011)} & M_{(012)} & ? & \cdots \\
M_{(020)} & M_{(120)} & M_{(220)} & ? & ? & M_{(021)} & M_{(022)} & ? & \cdots \\
M_{(001)} & M_{(101)} & M_{(201)} & M_{(011)} & M_{(021)} & ? & ? & M_{(111)} & \cdots \\
M_{(002)} & M_{(102)} & M_{(202)} & M_{(012)} & M_{(022)} & ? & ? & M_{(112)} & \cdots
\end{pmatrix}$$

Figure 1: Example of moment matrix

For a moment matrix $M$ let its completion $\bar{M}$ be a matrix obtained from $M$
by replacing question marks by arbitrary numbers. The above considerations
give us

Theorem 7.3 Let distribution $\mu_\beta$ satisfy (G1) and (G2). Then its moment
matrix has a completion $\bar{M}$ such that $\mathrm{rank}(\bar{M}) \le K$.

One may extend the definition of rank to incomplete matrices by setting it equal
to the maximal size of a nonzero minor that contains only known moments (i.e.
does not contain question marks). It is easy to see that for every completion
$\bar{M}$ of $M$, the inequality $\mathrm{rank}(M) \le \mathrm{rank}(\bar{M})$ holds. Thus,

Corollary 7.4 Let distribution $\mu_\beta$ satisfy (G1) and (G2). Then $\mathrm{rank}(M) \le K$.
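The extended notion of rank can be made concrete by brute force. The sketch below is not a procedure from this article: the names `exact_rank` and `incomplete_rank`, the use of `None` for a question mark, and the small test matrices are all assumptions made for illustration. It enumerates square submatrices that avoid unknown entries and returns the largest size with a nonzero determinant, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations


def exact_rank(rows):
    """Rank of a fully known matrix via Gaussian elimination on Fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for c in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][c] / m[rank][c]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def incomplete_rank(matrix):
    """Largest size of a nonsingular square submatrix without None entries."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    for size in range(min(n_rows, n_cols), 0, -1):
        for rs in combinations(range(n_rows), size):
            for cs in combinations(range(n_cols), size):
                block = [[matrix[r][c] for c in cs] for r in rs]
                if any(x is None for row in block for x in row):
                    continue  # this minor touches a question mark
                if exact_rank(block) == size:
                    return size
    return 0
```

The search is exponential in the matrix size, so it is only suitable for small examples. For the invented matrix `[[1, 2, None], [2, 4, 6], [3, 6, 9]]` the incomplete rank is 1; replacing `None` by 3 keeps the rank at 1, while replacing it by 0 raises it to 2, in line with the inequality stated above.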

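For $\mathcal{K} \subseteq \mathcal{L}^0 \setminus \mathcal{L}$, let $M[\mathcal{K}]$ denote a matrix consisting of those
columns of moment matrix $M$ that correspond to elements of $\mathcal{K}$.

Now we are ready to formulate the third assumption regarding distribution
$\mu_\beta$:

(G3) There exists a subset of column indices $\mathcal{K} \subseteq \mathcal{L}^0 \setminus \mathcal{L}$ such that:

    (a) For every two completions $\bar{M}'$ and $\bar{M}''$ of the moment matrix
        satisfying $\mathrm{rank}(\bar{M}') \le K$ and $\mathrm{rank}(\bar{M}'') \le K$, the
        equality $\bar{M}'[\mathcal{K}] = \bar{M}''[\mathcal{K}]$ holds.

    (b) Let $\bar{M}$ be any completion of the moment matrix satisfying
        $\mathrm{rank}(\bar{M}) \le K$. Then $\mathrm{rank}(\bar{M}[\mathcal{K}]) = K$.

Note that when (G3) holds, $\bar{M}[\mathcal{K}]$ is uniquely defined.

Theorem 7.5 Let distribution $\mu_\beta$ satisfy (G1), (G2), and (G3). Then for
every solution of system (36), $\mathrm{Lin}(\alpha^1, \dots, \alpha^K) = \mathrm{Lin}(\bar{M}[\mathcal{K}])$
(where $\mathrm{Lin}(\bar{M}[\mathcal{K}])$ is a linear subspace of $\mathbb{R}^{|L|}$ spanned by
columns of $\bar{M}[\mathcal{K}]$).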

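Proof. By theorem 7.3, for every solution of (36) there exists a completion $\bar{M}'$
of $M$ such that $\mathrm{Lin}(\bar{M}') \subseteq \mathrm{Lin}(\alpha^1, \dots, \alpha^K)$. Then
$\mathrm{rank}(\bar{M}') = \dim(\mathrm{Lin}(\bar{M}')) \le \dim(\mathrm{Lin}(\alpha^1, \dots, \alpha^K)) \le K$.
Thus, by (G3), $\bar{M}'[\mathcal{K}] = \bar{M}[\mathcal{K}]$, and consequently
$\mathrm{Lin}(\bar{M}[\mathcal{K}]) \subseteq \mathrm{Lin}(\bar{M}')$. As
$\dim(\mathrm{Lin}(\bar{M}[\mathcal{K}])) = \mathrm{rank}(\bar{M}[\mathcal{K}]) = K$, we obtain
$\mathrm{Lin}(\alpha^1, \dots, \alpha^K) = \mathrm{Lin}(\bar{M}[\mathcal{K}])$. ■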

Corollary 7.6 Let distribution $\mu_\beta$ satisfy (G1), (G2), and (G3). Then:

    (a) To obtain a solution of (36), it is enough to take $\alpha^1, \dots, \alpha^K$
        equal to any basis of $\bar{M}[\mathcal{K}]$ (e.g., equal to any $K$ linearly
        independent columns of $\bar{M}[\mathcal{K}]$).

    (b) Any other solution $\alpha'^1, \dots, \alpha'^K$ is obtained from the above one
        by multiplying it by a nonsingular $K \times K$ matrix.

    (c) Every solution $\alpha^1, \dots, \alpha^K$ is a basis of $Q$, a support of $\mu_\beta$.

Proof. (a) and (b) are obvious.

To prove (c), consider that by theorem 7.1 every basis of $Q$ is a solution of
(36). By (b), all solutions are bases of the same linear subspace of
$\mathbb{R}^{|L|}$. Thus, every solution is a basis of $Q$. ■

By theorem 7.5 and its corollary, assumption (G3) is sufficient to identify
a support of $\mu_\beta$. It appears to be close to a necessary condition: in many
cases where (G3) is violated, we were able to construct a different distribution
$\mu'_\beta$ which has the same $\ell$-moments as $\mu_\beta$ (and therefore $\mu'_\beta$ is
indistinguishable from $\mu_\beta$ based on available observations). However, the exact
formulation of necessary conditions for identifiability of the support of $\mu_\beta$
is an open question.

To verify whether condition (G3) holds, it is enough to analyze the moment 
matrix. Numerous practical methods might be suggested to do such verification. 
Without going into details, we demonstrate by example one possibility. 

Example 7.7 Consider a case $J = 3$, $L_1 = L_2 = L_3 = 2$; thus, $\mathbb{R}^{|L|} = \mathbb{R}^6$.
Consider a distribution $\mu_\beta$ concentrated in three points, $\beta^{(1)}$, $\beta^{(2)}$,
and $\beta^{(3)}$, with every point having probability $\frac{1}{3}$ (see figure 2). As
$\beta^{(3)} = \frac{1}{2}\beta^{(1)} + \frac{1}{2}\beta^{(2)}$
and $\{\beta^{(1)}, \beta^{(2)}, \beta^{(3)}\} = \mathrm{Supp}(\mu_\beta)$, (G2) is satisfied for $K = 2$.

The moment matrix $M$ of this distribution (which corresponds to the moment
matrix on figure 1) is shown on figure 2.

A submatrix of $M$ consisting of rows 3 and 4 and columns 1 and 2 is
nonsingular, and therefore $x$ and $y$ such that

    column1 · x + column2 · y = column7

are uniquely defined; they are $x = \frac{131}{160}$ and $y = -\frac{99}{160}$. This
allows construction of the only possible completion of column 7, which is shown
on figure 2.

Thus, column 1 and (completed) column 7 give a basis for a support of $\mu_\beta$.
It is easy to see that $\mathrm{Lin}(\text{column1}, \text{column7}) = \mathrm{Lin}(\beta^{(1)}, \beta^{(2)}, \beta^{(3)})$,
as one would expect.

Vectors "column1" and "completed column7" do not satisfy condition (A0).
To obtain a basis satisfying (A0), one can take $\alpha^1 = \text{column1}$ and
$\alpha^2 = \text{column7} \cdot \frac{40}{19}$. Vectors $\alpha^1$ and $\alpha^2$ are shown on figure 2.

Figure 2: Illustration to example 7.7

(Calculations for this and subsequent examples were done with Waterloo
Maple™ v.7.00.) ■
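The completion step used in example 7.7 (express a partially known column as a combination of two fully known columns, then fill in its missing entries) can be sketched as follows. The function name `complete_column`, the use of `None` for a question mark, and the numbers in the usage note are hypothetical; they are not the moments of the example. The sketch assumes $K = 2$ and that the two chosen rows give a nonsingular $2 \times 2$ system.

```python
from fractions import Fraction


def complete_column(col_a, col_b, partial, rows):
    """Complete `partial` as x*col_a + y*col_b (a rank-2 completion).

    `rows` holds two row indices at which `partial` is known; x and y are
    obtained from these two rows by Cramer's rule.
    """
    r, s = rows
    a11, a12, b1 = Fraction(col_a[r]), Fraction(col_b[r]), Fraction(partial[r])
    a21, a22, b2 = Fraction(col_a[s]), Fraction(col_b[s]), Fraction(partial[s])
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the chosen 2x2 submatrix is singular")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    completed = [x * Fraction(a) + y * Fraction(b) for a, b in zip(col_a, col_b)]
    # Every other known entry must agree with the completion.
    for i, known in enumerate(partial):
        if known is not None and completed[i] != Fraction(known):
            raise ValueError(f"known entry in row {i} contradicts the completion")
    return x, y, completed
```

For instance, with the invented columns `[1, 2, 3, 4]` and `[0, 1, 1, 2]` and the partial column `[2, 5, None, None]`, solving on rows 0 and 1 gives `x = 2`, `y = 1`, and the completed column `[2, 5, 7, 10]`.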

The second question is whether $h^v_\ell$ may be uniquely determined from (36)
given a solution $\alpha^1, \dots, \alpha^K$. In general, the answer is negative: not all
$h^v_\ell$ may be uniquely determined. However, a number of the most important values
may always be determined uniquely, as the following theorem shows.

Theorem 7.8 Let $\alpha^1, \dots, \alpha^K$ be a solution of (36), and let a set of index
pairs $j_1 l_1, \dots, j_K l_K$, with $l_k \in [1..L_{j_k}]$, be chosen so that the matrix
$(\alpha^{k'}_{j_k l_k})_{k'k}$ is nonsingular (this is always possible as
$\mathrm{rank}(\alpha^1, \dots, \alpha^K) = K$). Let $\mathcal{J}_0 = \{j_1, \dots, j_K\}$ (note that
$|\mathcal{J}_0|$ may be less than $K$). Then:

    (a) For every $\mathcal{J}$ such that $\mathcal{J}_0 \cap \mathcal{J} = \emptyset$, for every
        $\ell \in \mathcal{L}^{[\mathcal{J}]}$, and for every $k \in [1..K]$, the conditional
        expectation $\mathcal{E}(G_k \mid X = \ell)$ is uniquely defined.

    (b) Let, in addition, there exist $j_0 \notin \mathcal{J}_0$ and $l_0 \in [1..L_{j_0}]$ such
        that every $K \times K$ submatrix of the $(K+1) \times K$ matrix
        $(\alpha^{k'}_{j_k l_k})_{k' \in [1..K],\, k \in [0..K]}$ is nonsingular. Then for every
        $\mathcal{J}$ such that $(\mathcal{J}_0 \cup \{j_0\}) \cap \mathcal{J} = \emptyset$, for every
        $\ell \in \mathcal{L}^{[\mathcal{J}]}$, and for every $k \in [1..K]$, the conditional
        variance $\mathcal{D}(G_k \mid X = \ell)$ is uniquely defined.

Proof. (a) Consider a subsystem of (36):

$$\sum_{k'} \alpha^{k'}_{j_k l_k}\, h^{1_{k'}}_{\ell} = M_{\ell+(l_k)_{j_k}}, \qquad k = 1, \dots, K$$

By theorem 7.1, $h^{1_k}_\ell = M_\ell \cdot \mathcal{E}(G_k \mid X = \ell)$ is a solution of this
system, and by assumption of the theorem, there are no other solutions.

(b) By part (a) of the theorem, for every ko G [1-.^^] and every k G [l..if], 
values ^J^Ijj.-) a-re uniquely determined from (|36|l . Now consider a subsystem 
of 



K 



By theoremO /ij"""^^"' = Mr £(G^'=o+^'=' \ X = t)is& solution of this system, 
and by assumption of the theorem, there are no other solutions. This is enough
to calculate $\mathcal{D}(G_k \mid X = \ell)$ using formula (30). ■

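Example 7.9 We continue example 7.7. Consider a subsystem of (36):

$$
\begin{cases}
\alpha^1_{21} h^{(1,0)}_{(1,0,0)} + \alpha^2_{21} h^{(0,1)}_{(1,0,0)} = M_{(1,1,0)} \\
\alpha^1_{22} h^{(1,0)}_{(1,0,0)} + \alpha^2_{22} h^{(0,1)}_{(1,0,0)} = M_{(1,2,0)}
\end{cases}
\qquad\text{i.e.}\qquad
\begin{cases}
\frac{7}{15} h^{(1,0)}_{(1,0,0)} + \frac{443}{855} h^{(0,1)}_{(1,0,0)} = \frac{89}{405} \\
\frac{8}{15} h^{(1,0)}_{(1,0,0)} + \frac{412}{855} h^{(0,1)}_{(1,0,0)} = \frac{136}{405}
\end{cases}
$$

Solving this system gives

$$h^{(1,0)}_{(1,0,0)} = \frac{131}{99}, \qquad h^{(0,1)}_{(1,0,0)} = -\frac{76}{99}$$

and, as $h^v_\ell = M_\ell \cdot \mathcal{E}(G^v \mid X = \ell)$,

$$\mathcal{E}(G^{(1,0)} \mid X = (1,0,0)) = \frac{131}{55}, \qquad \mathcal{E}(G^{(0,1)} \mid X = (1,0,0)) = -\frac{76}{55}.$$

Considering similar subsystems, one obtains, in particular,

$$h^{(1,0)}_{(1,0,1)} = \frac{3089}{2970}, \quad h^{(0,1)}_{(1,0,1)} = -\frac{2641}{3960}, \quad h^{(1,0)}_{(1,0,2)} = \frac{841}{2970}, \quad h^{(0,1)}_{(1,0,2)} = -\frac{133}{1320}.$$

Substituting these values into subsystems

$$
\begin{cases}
\alpha^1_{31} h^{(2,0)}_{(1,0,0)} + \alpha^2_{31} h^{(1,1)}_{(1,0,0)} = h^{(1,0)}_{(1,0,1)} \\
\alpha^1_{32} h^{(2,0)}_{(1,0,0)} + \alpha^2_{32} h^{(1,1)}_{(1,0,0)} = h^{(1,0)}_{(1,0,2)}
\end{cases}
\qquad
\begin{cases}
\alpha^1_{31} h^{(1,1)}_{(1,0,0)} + \alpha^2_{31} h^{(0,2)}_{(1,0,0)} = h^{(0,1)}_{(1,0,1)} \\
\alpha^1_{32} h^{(1,1)}_{(1,0,0)} + \alpha^2_{32} h^{(0,2)}_{(1,0,0)} = h^{(0,1)}_{(1,0,2)}
\end{cases}
$$

one finds

$$h^{(2,0)}_{(1,0,0)} = \frac{3323}{726}, \qquad h^{(1,1)}_{(1,0,0)} = -\frac{7087}{2178}, \qquad h^{(0,2)}_{(1,0,0)} = \frac{1805}{726}$$

and thus

$$\mathcal{E}(G^{(2,0)} \mid X = (1,0,0)) = \frac{9969}{1210}, \qquad \mathcal{E}(G^{(0,2)} \mid X = (1,0,0)) = \frac{1083}{242}.$$

This allows us to calculate conditional variances (using formula (30)):

$$\mathcal{D}(G_1 \mid X = (1,0,0)) = \mathcal{D}(G_2 \mid X = (1,0,0)) = \frac{15523}{6050}.$$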

Table 1 summarizes the conditional expectations and conditional variances that
may be calculated in our example. Although all values are exact rational
numbers, we use decimal notation to make comparison of values easier. We also
put standard deviations in the table instead of variances.

As we have mentioned, there are many choices of basis for the support of
distribution $\mu_\beta$. Another possibility is to take $\{\beta^{(1)}, \beta^{(2)}\}$ as a basis.
The result of calculations in this basis is given in table 2. One can see that,
although the numbers are different, their relative position remains the same. ■
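The first step of example 7.9 can be checked with exact rational arithmetic. The sketch below uses only numbers printed in the example (the coefficients $\frac{7}{15}$, $\frac{443}{855}$, $\frac{8}{15}$, $\frac{412}{855}$, the right-hand sides $\frac{89}{405}$, $\frac{136}{405}$, and $h^{(2,0)}_{(1,0,0)} = \frac{3323}{726}$). Two things are assumptions of the sketch rather than statements of the text: that $M_{(1,0,0)}$ may be recovered as $M_{(1,1,0)} + M_{(1,2,0)}$ (summing over the outcomes of the second measurement), and that the conditional variance is the second conditional moment minus the squared conditional expectation. The helper name `solve_2x2` is ours.

```python
from fractions import Fraction as F
from math import sqrt


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*u + a12*w = b1, a21*u + a22*w = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det


# First-order subsystem of example 7.9 for l = (1,0,0).
h10, h01 = solve_2x2(F(7, 15), F(443, 855), F(8, 15), F(412, 855),
                     F(89, 405), F(136, 405))

# Assumption: marginalising over measurement 2 gives M_(1,0,0).
M100 = F(89, 405) + F(136, 405)

# Conditional expectations: h / M.
e1, e2 = h10 / M100, h01 / M100

# Second moment from the printed value of h^(2,0), then the variance
# as "second moment minus squared mean" (assumed form of the formula).
second_moment_g1 = F(3323, 726) / M100
var_g1 = second_moment_g1 - e1 ** 2
sd_g1 = sqrt(float(var_g1))

if __name__ == "__main__":
    print(h10, h01, e1, e2, var_g1, round(sd_g1, 4))
```

The rounded values of `e1`, `e2`, and `sd_g1` agree with the entries for $\ell = (1,0,0)$ in table 1.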



Remark 7.10 The standard deviations in the above example are relatively
large. This is a direct consequence of the fact that in this example we have too
small a number of measurements. When the number of measurements increases, the
standard deviation becomes smaller and smaller. ■

Remark 7.11 Theorem 7.8 guarantees that it is always possible to find $J - K$
measurements such that expectations of $G$ conditional on outcomes of these
measurements may be uniquely determined from the system (36). The possibility
of determining conditional variances is not guaranteed by this theorem,
however. In many practical cases that we have investigated, the conditions of
part (b) of theorem 7.8 are satisfied, and conditional variances can be found
(as in example 7.9). The exact conditions for determinability of conditional
variances are an open question. ■

Table 1: Conditional expectations and standard deviations calculated in basis $\{\alpha^1, \alpha^2\}$

$\ell$     $\mathcal{E}(G_1 \mid X=\ell)$   $\sigma(G_1 \mid X=\ell)$   $\mathcal{E}(G_2 \mid X=\ell)$   $\sigma(G_2 \mid X=\ell)$
(1,0,0)     2.3818                           1.6018                      -1.3818                           1.6018
(2,0,0)    -0.7273                           1.2214                       1.7273                           1.2214
(0,1,0)     0.5065                           2.0571                       0.4935                           2.0571
(0,2,0)     1.4318                           2.0709                      -0.4318                           2.0709
(0,0,1)     1.9048                           1.9122                      -0.9048                           1.9122
(0,0,2)     0.0000                           1.8642                       1.0000                           1.8642



Table 2: Conditional expectations and standard deviations calculated in basis $\{\beta^{(1)}, \beta^{(2)}\}$

$\ell$     $\mathcal{E}(G_1 \mid X=\ell)$   $\sigma(G_1 \mid X=\ell)$   $\mathcal{E}(G_2 \mid X=\ell)$   $\sigma(G_2 \mid X=\ell)$
(1,0,0)     0.7667                           0.3091                       0.2333                           0.3091
(2,0,0)     0.1667                           0.2357                       0.8333                           0.2357
(0,1,0)     0.4048                           0.3970                       0.5952                           0.3970
(0,2,0)     0.5833                           0.3997                       0.4167                           0.3997
(0,0,1)     0.6746                           0.3690                       0.3254                           0.3690
(0,0,2)     0.3070                           0.3598                       0.6930                           0.3598
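The statement that the two tables differ only by a change of basis can be checked numerically. In both tables the two expectations in each row sum to 1, so a linear change of basis acts on $G_1$ as an affine map $G_1 \mapsto a + b\,G_1$; this reading is our assumption, not a claim made in the text. The sketch below fits $a$ and $b$ from the first two rows and then checks that the same map (with standard deviations scaled by $|b|$) reproduces the remaining rows up to rounding. The names `TABLE1`, `TABLE2`, `fit_affine`, and `max_deviation` are hypothetical.

```python
# Outcome vector -> (E(G_1 | X = l), sigma(G_1 | X = l)), copied from the tables.
TABLE1 = {
    (1, 0, 0): (2.3818, 1.6018),
    (2, 0, 0): (-0.7273, 1.2214),
    (0, 1, 0): (0.5065, 2.0571),
    (0, 2, 0): (1.4318, 2.0709),
    (0, 0, 1): (1.9048, 1.9122),
    (0, 0, 2): (0.0000, 1.8642),
}
TABLE2 = {
    (1, 0, 0): (0.7667, 0.3091),
    (2, 0, 0): (0.1667, 0.2357),
    (0, 1, 0): (0.4048, 0.3970),
    (0, 2, 0): (0.5833, 0.3997),
    (0, 0, 1): (0.6746, 0.3690),
    (0, 0, 2): (0.3070, 0.3598),
}


def fit_affine(x0, y0, x1, y1):
    """Return (shift, scale) of the affine map sending x0 -> y0 and x1 -> y1."""
    scale = (y1 - y0) / (x1 - x0)
    return y0 - scale * x0, scale


def max_deviation():
    """Fit the map on two rows, return it with the worst mismatch over all rows."""
    shift, scale = fit_affine(TABLE1[(1, 0, 0)][0], TABLE2[(1, 0, 0)][0],
                              TABLE1[(2, 0, 0)][0], TABLE2[(2, 0, 0)][0])
    worst = 0.0
    for ell, (mean1, sd1) in TABLE1.items():
        mean2, sd2 = TABLE2[ell]
        worst = max(worst,
                    abs(shift + scale * mean1 - mean2),  # expectations map affinely
                    abs(abs(scale) * sd1 - sd2))         # deviations scale by |b|
    return shift, scale, worst


if __name__ == "__main__":
    print(max_deviation())
```

The fitted map is roughly $G_1 \mapsto 0.307 + 0.193\,G_1$, and the largest mismatch across all rows is of the order of the rounding error in the tables, which is one way to read "their relative position remains the same".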



Remark 7.12 By computations similar to those used in (15) and (19), one obtains
for every family of $J' \le J$ index pairs $j_1 l_1, \dots, j_{J'} l_{J'}$ with
$l_p \in [1..L_{j_p}]$ ($j_p$ is not necessarily different from $j_{p'}$ for $p \ne p'$)

$$\int \beta_{j_1 l_1} \cdots \beta_{j_{J'} l_{J'}}\, \mu_\beta(d\beta) = \sum_v M_v(\mu_g) \cdot A(v, j_1, l_1, \dots, j_{J'}, l_{J'})$$

where $A(v, j_1, l_1, \dots, j_{J'}, l_{J'})$ depends only on $\alpha^k$. Thus, if the system
(36) allows unique determination of all unknowns $h^v_\ell$, all moments of order up
to $J$ of $\mu_\beta$ can be identified. This is the case, for instance, in example 7.9.

We do not know now whether there exist some regular conditions under
which the system (36) has a unique solution (modulo change of basis). Examples
that we have considered suggest that in a regular case system (36) never has a
unique solution whenever $K > L_j$ at least for one $j$. (However, as theorem 7.8
shows, many values of interest may always be uniquely determined.) The exact
description of parameters that may be uniquely identified based on system (36),
and to what degree the freedom in choosing other parameters may be reduced,
is a subject for further investigation. ■

8 Numerical procedure 

We have established a number of precise relations between values of interest (i.e.
expectation and variance of hidden random vector $G$ conditional on outcomes of
measurements) and moments of the (unknown) distribution $\mu_\beta$, which are directly
estimable from observations. The most important of these relations are given
by equations (29) and by the system of equations (36). These relations suggest a
numerical procedure for estimation of values of interest.

As was mentioned above, sample frequencies $f_\ell$ are consistent estimators for
moments $M_\ell(\mu_\beta)$. Thus, applying the least squares method to the system

J' e [0..J- 1], V e V[J'], 

JC [1..J] : \J\ > J\ £eC^'^\ 

jeJ, le [i..Lj] (41) 
J' e [0..J] 

one obtains consistent estimators for a basis $\{\alpha^k\}_k$ and conditional expectations
of $G$.
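The sample frequencies that enter this system are simple to compute. The sketch below assumes, as a reading of the notation rather than a definition taken from this section, that $f_\ell$ is the fraction of individuals whose outcomes agree with $\ell$ on every nonzero component of $\ell$, with a 0 component meaning that the corresponding measurement is not constrained. The function name `frequency` and the four-record sample are hypothetical.

```python
from fractions import Fraction


def frequency(sample, ell):
    """Fraction of records matching `ell` on its nonzero components.

    `sample` is a list of outcome tuples (one per individual); `ell` uses 0
    for "this measurement is ignored" (an assumed convention).
    """
    if not sample:
        raise ValueError("empty sample")
    hits = sum(
        all(e == 0 or x == e for x, e in zip(record, ell))
        for record in sample
    )
    return Fraction(hits, len(sample))


if __name__ == "__main__":
    data = [(1, 1, 2), (1, 2, 1), (2, 1, 1), (1, 1, 1)]
    print(frequency(data, (1, 0, 0)), frequency(data, (1, 1, 0)))
```

With this convention the frequencies add up over the outcomes of any unconstrained measurement; for the invented sample, `frequency(data, (1, 1, 0)) + frequency(data, (1, 2, 0))` equals `frequency(data, (1, 0, 0))`. Records with missing outcomes are not handled here.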

The consistency of estimators obtained from (41) is an almost straightforward
corollary of the consistency of estimators $f_\ell$. The rate of convergence is a more
delicate question (as the rate of convergence of $f_\ell$ depends on $\ell$) and deserves
separate investigation.

Theorem 7.5 suggests another, two-step way of finding solutions of (41).
In the first step, one finds a basis from the frequency matrix (i.e. the moment
matrix with frequencies substituted for moments). After the basis is obtained,
(41) turns out to be a linear system with respect to $h^v_\ell$. This way requires
significantly less computation, but its convergence properties have to be more
carefully investigated.

One question regarding the numerical procedure is the choice of the value of $K$
for which system (41) should be solved. Theorem 7.3 and its corollary suggest that
one has to take $K$ equal to the rank of the frequency matrix (modulo possible
deviations of frequencies from the true moments).

Another question is how a numerical algorithm has to deal with the
nonuniqueness of the basis $\{\alpha^k\}_k$. In general, there are $K^2$ degrees of
freedom in the choice of a basis. Imposing condition (A0) reduces this number to
$K(K-1)$. One can consider additional restrictions on the choice of basis:

(A1) For every $k$, the unconditional expectation $\mathcal{E}(G_k)$ equals $\frac{1}{K}$:



$$h^{1_k}_{(0,\dots,0)} = \frac{1}{K}$$






(A2) The map $H_\alpha$ is an isometry of $P_g$ and $P_\beta$ (with respect to Euclidean
distance).

The first one corresponds to restricting transformations of $P_g$, described by
the matrix $A$ introduced earlier, to those having the "center" point
$(\frac{1}{K}, \dots, \frac{1}{K})$ of $P_g$ fixed. The second restriction guarantees that
variances do not depend on the choice of basis, and that variances calculated in
$g$-space coincide with variances calculated in $\beta$-space.

Imposing similar restrictions based on higher order moments, one might fully 
eliminate nonuniqueness. 

Estimation of variances is another source of problems. Formula (30) is of
theoretical importance, as it demonstrates that we have enough information to
estimate variances. However, it can hardly be used for numerical computations,
as it involves differences of values that we can only approximately estimate. We
are working on finding a better way to estimate variances.

9 Conclusion 

We developed a novel approach to the analysis of categorical data based on
considering distribution laws of observed random variables as realizations of
another random variable. This starting point leads to a fruitful development.

In the present article, we were able to obtain the system of equations (36) and
establish its properties in theorems 7.1–7.8. This provides a basis for an
efficient numerical procedure that gives (one form of) an answer to the General
GoM Problem.

We also believe that the approach in general, and our results regarding
system (36) in particular, may be successfully applied in other domains of
statistics, especially in latent structure analysis.